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Abstract 

We present a new Bayesian vision technique that aims at recovering a shape from 
two or more noisy observations taken under similar lighting conditions. The shape 
is parametrized by a piecewise linear height field, textured by a piecewise linear irra- 
diance field, and we assume Gaussian Markovian priors for both shape vertices and 
irradiance variables. The observation process, also known as rendering, is modeled 
by a non-affine projection (e.g. perspective projection) followed by a convolution 
with a piecewise linear point spread function, and contamination by additive Gaus- 
sian noise. We assume that the observation parameters (e.g. camera pose and noise 
variance) are calibrated beforehand. 

The major novelty of the proposed method consists of marginalizing out all the 
nuisance parameters (such as the irradiance field and the prior model hyperparam- 
eters), which is achieved by Laplace approximations. This reduces the inference 
to minimizing an energy- that only depends on the shape vertices, and therefore 
allows an efficient Iterated Conditional Mode (ICM) optimization scheme to be im- 
plemented. A Gaussian approximation of the posterior shape density is computed, 
thus providing not only an estimate of the mean geometry, but also of the uncer- 
tainty, which enables us to build a recursive algorithm to easily incorporate new 
data into the system. 

We illustrate the effectiveness of the general method described here by shape 
reconstruction results in a 2D case. A 3D version using the exact same approach is 
currently under development and aims at recovering a surface from stereo pairs or 
multiple images, directly reconstructing the topography by marginalizing out both 
albedo and shading effects. 
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Abstract. We present a new Bayesian vision technique that aims at recovering a shape from two 
or more noisy observations taken under similar lighting conditions. The shape is parametrized by a 
piecewise linear height field, textured by a piecewise linear irradiance field, and we assume Gaussian 
Markovian priors for both shape vertices and irradiance variables. The observation process, also 
known as rendering, is modeled by a non-affine projection (e.g. perspective projection) followed by 
a convolution with a piecewise linear point spread function, and contamination by additive Gaussian 
noise. We assume that the observation parameters are calibrated beforehand. 

The major novelty of the proposed method consists of marginalizing out the irradiances con- 
sidered as nuisance parameters, which is achieved by Laplace approximations. This reduces the 
inference to minimizing an energy that only depends on the shape vertices, and therefore allows an 
efficient Iterated Conditional Mode (ICM) optimization scheme to be implemented. A Gaussian ap- 
proximation of the posterior shape density is computed, thus providing estimates both the geometry 
and its uncertainty. We illustrate the effectiveness of the new method by shape reconstruction results 
in a 2D case. A 3D version is currently under development and aims at recovering a surface from 
multiple images, reconstructing the topography by marginalizing out both albedo and shading. 


1. INTRODUCTION 

In this work, we investigate the possibility of recovering a shape from a set of corrupted 
observations. To validate the proposed approach, we define the problem in a 2D space 
and perform experiments in this space. Once the approach has been clearly defined, and 
the model and algorithm choice justified (at least experimentally), the idea is to extend 
it to the 3D framework, more realistic. The final goal will be to recover a 3D surface 
geometry from multiple 2D images, in an efficient and robust way, without having to 
also infer the lighting and the spatially variable reflectance properties of this surface, 
which should be treated as nuisance parameters. 

In the plane, the problem is defined as follows. The surface is a Lambertian emitter 
defined by ai finite curve parametrized by a set of N v vertices v = {v fc }, and has an 
irradiance field attached to it, parametrized by a set of irradiance variables L— {L 3 }. 
There are at least 2 pinhole cameras (parametrized by the set © = {©*}: position, 
orientation, etc.) that record an intensity signal after projecting the irradiance onto a 
segment (in 3D it would be a perspective projection), see Fig. 1. The intensity is sampled 
after convolution with a sampling kernel, or point spread function (PSF), denoted by 
h. Then it is corrupted by a white additive Gaussian noise. Our goal is to provide an 
estimate of the geometry as well as the uncertainty related to this estimate. 


r 



FIGURE 1. Left: the geometry configuration in the object space, showing the 2 cameras. Right: the 
intensity recorded by each camera. The irradiance field is not shown here (we choose a smooth sine 
function with values between 0.2 and 0.8). 


To address this problem, we use a Bayesian framework [2], To simplify it and get an 
efficient and stable inference procedure, we make a few assumptions about the lighting 
scheme. We assume similar lighting conditions (as in a stereo setting), which enables 
us to consider a single irradiance field attached to the surface, which acts as a texture 
which is warped to form the observed intensities. Usually a surface is parametrized by 
an albedo field p and a reflectance function /, and the irradiance L is such that L oc pf. 
If we assume that we have similar lighting conditions between images, and that / does 
not depend on the angle between surface and camera, then / is constant from one image 
to another (though spatially variable), therefore we can re-parametrize by the irradiance. 

We can summarize our contributions as follows. First, we derive a model of the un- 
known surface by choosing an appropriate parametrization and efficient priors to stabi- 
lize the solution. Second, we show how to choose an appropriate discretization scheme 
by understanding the image formation process. Finally, as mentioned above, irradiances 
are used as parameters. We treat them as nuisance parameters and marginalize them out, 
deriving efficient approximations to remain computationally tractable. 


2. THE FORWARD PROBLEM 

2.1. Generative model and posterior distribution 

We assume that all the parameters are random variables governed by a joint proba- 
bility distribution. The relationships between all these variables are given as a graphical 
model in Fig. 2 (left), where each arrow represents a conditional density, and each leaf 






node a prior density. If an initial surface estimate is provided, the surface model is then 
given by this initial estimate and its uncertainty. Otherwise, if we have little knowl- 
edge about the surface geometry, we will just assume a smoothness prior. In all cases we 
choose to use a Gaussian distribution. The camera parameters follow a Dirac distribution 
around the value obtained by calibration (in our experiment, they are assumed known). 

The observations are assumed to be independent and corrupted by a zero-mean white 
Gaussian noise of variance o 2 . Therefore the conditional density of an observation 
given the surface and camera parameters is an iid Gaussian of variance a 2 and mean 
/(v,L, ©). The intensity I is synthesized from the surface (v,I) using the camera 
parameters © (this is also known as rendering and it is a deterministic process, described 
in Section 2.4). The likelihood of both surface and camera parameters is expressed as: 

= where U(y,L) = Tj-Jj (/,(v,£,©‘) - X' p ) 2 (1) 



FIGURE 2. The image formation model. Left: the multi-image graphical model. Right: the detailed 
model for one image (circles and rectangles respectively represent stochastic and deterministic processes). 


2.2. Surface parametrization and topology 

We parametrize the geometry by a set of coarse vertices {v^} using segments as 
shown on Fig. 3. To constrain the vertices, we parametrize them by a height field z(x): 
we have v fc = (x k , z k ), where the x k form a fixed uniform grid. 

Each segment is uniformly subdivided into N s sub-segments containing N s — 1 
aligned fine vertices {v- 7 } which define the irradiance field. The irradiance has a higher 
resolution than the geometry. There are multiple ways of choosing the irradiance model: 
it can be piecewise constant between the fine vertices (LP lies between 2 fine vertices), 
or piecewise linear (LP is defined on the vertex v- 7 ), as illustrated by Fig. 3. We will 
explain in Section 2.4 why a piecewise linear irradiance is preferred. 



Z/ piecewise constant model 


FIGURE 3. The geometry parametrization and topology (subdivided segments), and the 2 possible 
irradiance models (piecewise linear and piecewise constant). 














linear models for both irradiance and sampling is the only way to get a smooth en- 
ergy function, with continuous derivatives. For non-boundary vertices, linear irradiance 
seems to be sufficient: however if we consider vertices located at object boundaries tor 
occlusion boundaries), the model is discontinuous so we need a continuous sampling 
scheme to achieve the desired smoothness. 

Why is this smoothness so important? Because any deterministic optimization algo- 
rithm has little chance to converge to the global optimum if there are discontinuities 
in the derivatives (on the left figure, the local minima are quite obvious!). For computa- 
tional reasons, we do not intend to use stochastic techniques such as simulated annealing 
to get around local minima problems. 



FIGURE 4. Variation of the -log likelihood (C/+const.) as a function of one of the vertices (fixed 
irradiance). Left: non-boundary vertex, piecewise linear vs. piecewise constant irradiance model L. 
Right: boundary vertex, piecewise linear vs. piecewise constant sampling kernel h. 


3. THE INVERSE PROBLEM: BAYESIAN INFERENCE 

Computer vision, or model reconstruction from observations, can be seen as the inverse 
problem of rendering. Bayesian inference [2] is an efficient way to deal with such 
ill-posed inverse problems. In the Bayesian framework, model recovery becomes a 
parameter estimation problem (more precisely, we estimate parametric pdfs), which is 
achieved by using existing efficient optimization algorithms. 

Using Bayes’ rule and the graphical model of Fig. 2 (left) we get the joint posterior: 

P(v, L, {©'} |{jr}) oc P(y,L) n P(X\v t L, ©*) P(& 1 ) (6) 
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It is well-known in Bayesian inference that one should integrate the posterior over 
all unwanted, or nuisance parameters [1], to achieve a good stability: in our case, the 
camera parameters and the irradiances should be marginalized out. Stability is not the 
only reason to perform marginalization, as we will see in the algorithm section. Camera 
marginalization is rather simple since we assumed a Dirac distribution for 0. 




2.3. The geometry and irradiance priors 


To stabilize the solution, in our 2D experiments we use very simple and spatially 
invariant smoothness priors on both heights and irradiances, corresponding to first order 
Markov chains (limited to nearest neighbor interactions): 

P(v) = ^e“ a * (v) where $(v) = (z k+1 - z k f (2) 

A* k 

P(L ) = -^-e- a * (L) where $(L) = V (Z2 +1 - L j f (3) 

Za ^ 

More complex priors should be used if spatial adaptivity is required, or if discontinuities 
need to be modeled. Using efficient priors is important when dealing with missing or 
insufficient data (in real scenarios, some parts of the surface may be hidden) [4]. 

2.4. The deterministic image formation process 

We focus here on the rendering process, or how to obtain discrete pixel intensities 
from the set of vertices and irradiances in several steps (see Fig. 2 right). Each vertex is 
projected according to the following non-linear pinhole camera model (this is equivalent 
to a perspective projection in 3D), where u is the direction, T the location and a,b 
constants related to internal camera parameters: 

pi = a ^~j!^±2± + b with 0 = {u,T.F} (4) 

U x + Z ] VL z + T. 

If we have a linear irradiance model between fine vertices, the projected irradiance field 
C is assumed to be also linear, since the fine segments are small enough. The intensity 
for each pixel I p is obtained by first convolving C with a PSF h, then point sampling on 
a regular grid { p }: I p = ( £*h)(p ). 

The PSF can be decomposed as a discrete sum of translated sampling kernels A, 
so that I p is obtained by convolution with a fixed A, sampling and then a discrete 
convolution of the pixel values which can be taken out of the rendering. For simplicity, 
we ignore this last convolution and assume that h = A. 

Since £ is a linear function of { LP } , we rewrite the imaging equation as follows; 

Ir, — ^ X J p U where A p = function of P(v. 0) (5) 
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Then an important question arises: what kernel should we use? The simplest would be a 
normalized box function h(x) = 1 with x € [—0.5, 0.5]. However, we might need a more 
continuous one, such as the hat function h(x) = 1 — |rc| with x 4 [—1, lj. 

We have implemented the 4 possible combinations of piecewise linear and constant 
models for both irradiance and sampling kernel, and studied the behavior of the energy 
U(v,L, 0) relative to one of the images, when one of the vertex heights z k varies (all 
others being equal to their true value). Fig. 4 clearly illustrates the fact that choosing 



3.1. Irradiance marginalization, approximations 


We need to calculate the following integral to marginalize out the irradiance: 


P(vj{X }) = f P(v ■ L\{X'}) 

Jn 


e -U( V ,L)-WL) dL 


which can be achieved by using a Laplace approximation, assuming that the integrand 
can be well approximated by a multivariate Gaussian distribution. The integrand is pro- 


portional to the pc 


ior P(Z.jv, {X*}) (fixed vertices), which is a Gaussian distribution 


if we assume unbounded irradiances (since I p is linear in L, U is quadratic in L, and we 
also have a quadratic penalty <£>). For physical reasons, the irradiance is positive and 
bounded, so the distribution is not rigorously Gaussian. We will assume that the data 
constrains it to take values far enough from the bounds to ensure the validity of the 
Laplace approximation. Now we need to calculate the determinant of the covariance 
matrix [E], and the optimum (MAP) of L given the current geometry v, so that: 


/>(v|{JV})«P(v) 


-U(v±)-i3$(L) 


L(v) = argmaxP(L|v, {X 1 }) = argmin [ U (v,L) — /3$(L) 

Lj l 


-H7 r 


d 2 

p ~ dUdD 


U(v,L)-p<S>(L)] u 


Experiments have shown that the -log|E| compared to the term U(~v,L ) + /)$(!/) has 
negligible variations when changing vertex heights, so that it can be neglected when 
optimizing w.r.t. vertices (see Fig. 5 for an illustration). 



FIGURE 5. Solid line: marginal energy U' as a function of one of the vertices, computed with the 
proposed approximation; Dashed line: the normalizing constant that was neglected in this computation. 

The optimum can be computed by a diagonal Newton-Raphson descent algorithm; 
a few steps are sufficient to obtain a good convergence (see Section 3.2 for derivative 
computations). However, it is preferable to have a closed-form approximation of this 
optimum, so that the derivatives can be calculated as it will be mentioned in Section 3.2. 
We propose one that has a reasonable accuracy when the weights A are close to 0 or 1 
and there is little regularization. This gives good results in general, despite the absence 
of regularization, making the optimization itself useless in practice: 


LP(y) ~ 


XL G)' A 
XL(X 




Finally we propose the following approximation of the marginalized posterior, so that 
the entire problem reduces to minimizing the marginal energy U'(v ): 

P(vj{X 1 }) x e~ u (v> where tf'(v) = U (v,Z(v)) + £$ (l(v)) - logP(v) (12) 

We verify once again that we have made the right choice regarding both irradiance and 
sampling models: on Fig. 6 the oscillations related to piecewise constant models become 
quite obvious, and often lead to local minima, and in some cases a bias is noticeable (the 
global optimum is not the closest to the true value). For boundary vertices the linear 
sampling scheme is the only one to provide an acceptable marginal energy function. 
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FIGURE 6. Variation of the marginal energy as a function of one of the vertices. Left: non-boundary 
vertex, piecewise linear vs. piecewise constant irradiance model. Right: boundary vertex, piecewise linear 
vs. piecewise constant sampling kernel h. 



3.2. Computing the derivatives 


In order to achieve surface recovery via deterministic minimization of the marginal 
energy £/'(v), we need to compute the derivatives of this energy w.r.t. vertices. This 
requires us to compute the derivatives of the rendered intensity, of the optimal irradiance 
and of the prior penalty terms. To do this, we use the chain rule, accounting for the graph 
structure of all variables involved. For instance, let a vector C be a function of A through 
a set of variables {B 1 }; then the derivative matrices obey the chain rule: 


'5C 

dA 
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dA 


( 13 ) 


For the rendered intensity, given the graph structure in Fig. 2 (right) we get the following 
expression when the irradiance is a function of the vertices: 
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This derivative is involved in the first and second derivatives of the marginal energy: 





' dr 

dv k 


= ~r (r - x’ 


dii 


a- 


dv 


r fe 


■ Q' 


~a$(v) 

dv k 




l,p 


<9$(Z) 

dh 


dv 

d'v k 


' d 2 U' ' 

~ —V 

r«n 

\ dI V 

— 1 — 

"d 2 $(v)‘ 


' d 2 $(L)' 

'did 

dv k dv l 

“ <7 2 ^ 
i,p 

dv k _ 

_dv l \ 


dv k dv l \ 

i 

dv k dD 

dv 1 


(15) 


(16) 


where I p refers to the z-th intensity image rendered with ©\ We approximate the second 
derivatives of U 1 by neglecting the contribution of the second derivatives of the intensity 
and the optimal irradiance. Here we can see that we also need to compute the derivative 
of the optimal irradiance, hence the advantage of having a closed-form expression 
(function of v) such as the one in Eqn. (11). 


3.3. The surface recovery algorithm 


The goal of the reconstruction algorithm is to provide a Gaussian approximation of 
the posterior marginal P(v|{JW}). To achieve this, we need to compute the mode v of 
this distribution, which is equivalent to minimizing the energy V w.r.t. the geometry. 
Then we need a quadratic approximation of the energy around the optimum, hence the 
second derivatives (16) taken at v. The covariance matrix [E] provides an measure of 
the uncertainty on the geometry estimate v: 



d 2 U' 


dv k dv l 
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(17) 


Keeping the inverse covariance matrix at the end of the reconstruction enables us to build 
a recursive update algorithm, by using this Gaussian approximation as a prior for the 
next surface estimation process. This way, data can be added to the model sequentially, 
which can have multiple advantages. For instance the computational complexity is 
reduced since we do not need to use all the data at the same time. Furthermore, we can 
remove the restrictive assumptions about the lighting, by processing sets of images with 
similar lighting conditions simultaneously, and combining the different sets recursively. 

The proposed optimization algorithm is iterative, and at each step the rendering 
process is linearized around the current estimate v using the intensity derivatives: 

/ p (v,L,©) — I p ^v.L(v), ©j +Y^ 

k 

This makes U(v, L) a quadratic form in v. Since we choose quadratic penalty functions 
for the priors, U' is also quadratic in v. Moreover, it has a first order Markov structure 
when there are no occlusions (the dependence is limited to the first order neighbors). 
Therefore, optimizing this quadratic form is best achieved by an Iterative Conditional 
Mode (ICM) procedure, which benefits from the local dependence structure. Because of 
the very limited dependence, using a conjugate gradient in this case is clearly not a good 
choice, as our experiments have shown. In practice, given the weak dependence between 
vertices, an independent optimization turned out to be very efficient, and was achieved 
by using a diagonal quasi-Newton descent technique: 


dU' 

dv k 


(v fc - v fc ) 


(18) 




We also noticed that an accurate optimization of the quadratic form does not help 
increase the computation speed nor the result quality, since linearization is only a rough 
approximation. We did not observe any improvement from one to multiple iterations, 
therefore we do not recommend more than one step of descent before recomputing the 
local quadratic form around the updated geometry estimate. 

The proposed algorithm is summarized as follows: 

• Input: {X 1 } o 2 . 

• Initialization: v = constant (or a previous estimate if available); 

• Repeat until convergence: 

- Irradiance marginalization: compute Z.(v), Eqn. (1 1); 

- Intensity and irradiance derivatives: Eqn. (14), derivative of Eqn. (11); 

- First and second derivatives of U Eqns. (15)-(I6); 

- One step diagonal quasi-Newton update: Eqn. (19). 

• Inverse covariance estimation: Eqn. (17). 

• Output: geometry v = v, covariance [E -1 ]. 

The Bayesian approach enables us to automatically estimate the regularization param- 
eters q and (3, as well as the subdivision level N s (i.e. model selection, via Bayes fac- 
tors). This has been investigated in our experiments, but it is not described in this paper. 
Moreover, the segment subdivision level could be selected locally and dynamically (in 
the inner loop of the algorithm), depending on the current local geometry. 

4. 2D EXPERIMENT: RESULTS AND CONCLUSIONS 

The reconstruction algorithm has been successfully applied to the problem described in 
the introduction, and the results are shown on Fig. 7. We have o — 0.01 in this experiment 
(1% of the intensity range, since / € [0, 1]). The reader can evaluate the reconstruction 
quality by observing the estimated vertices and the error bars (corresponding to a 
marginal posterior probability greater than 0.1, computed using the estimated inverse 
covariance). The plot on the right shows the joint distribution of 3 pairs of vertices 
conditioned upon all others: only nearest neighbors interact and can therefore be strongly 
correlated (the correlation decays exponentially with the distance). The coarse segments 
are larger than pixels, otherwise there would be longer range interactions. This means 
that the inverse covariance matrix is sparse: it is possible to store it along with the 
surface estimate and to use it as a prior for subsequent inferences. 

The main conclusions of our 2D experiments are: 1) continuous models for both ir- 
radiance and sampling should be used to ensure the continuity of the energy functions; 
2) irradiance marginalization reduces the problem dimension from (N s 4- 1 )N V irradi- 
ances+vertices to N v vertices, also strongly reducing the interaction structure-, 3) we 
observed that marginalization also makes the energy landscape more quadratic , making 
the use of Newton-like techniques appropriate. 





FIGURE 7. The inferred vertex heights (10 iterations, parallel quasi-Newton optimization). Left: error 
bars = inferred geometry (probability >0.1), solid line = ground truth, dashed lines = the two camera 
configurations. Right: joint conditional posterior distribution of 2 vertex heights given all others (the 
indices are shown), illustrating the pairwise interactions. 


5. EXTENSION TO 3D AND FUTURE WORK 

To extend this promising approach to a more useful 3D framework, we use subdivided 
triangular meshes instead of subdivided segments, and 2D images instead of ID sig- 
nals. Hidden surface removal needs to be accurately performed (we assumed there were 
no occlusions in the 2D experiments). This is achieved through a recursive polygonal 
approach which subtracts triangles from an triangle/pixel intersection polygon [4], Ren- 
dering using continuous irradiance and PSF is made possible by computing the order-2 
moments of the visibility polygons [3]. In 3D, the goal is to first infer the scene geom- 
etry (3D equivalent of the method described in this paper), then to infer the albedo and 
reflectance functions using the estimated geometry (empirical Bayes approach). 

To achieve accurate surface recovery, designing and studying realistic prior models [4] 
is needed. Simultaneous reconstruction and camera calibration could also be addressed 
through marginalization. Finally, Bayesian model selection should be considered if a 
spatially adaptive mesh subdivision is needed. 
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